feat(model): add quantization support for LLM2Vec text encoder#12
Open
Lee-Jun-Hyuk-37 wants to merge 1 commit into nv-tlabs:main
Conversation
Add a KIMODO_QUANTIZE env var to load the Llama-3-8B text encoder with reduced precision via bitsandbytes:

KIMODO_QUANTIZE=4bit - NF4 4-bit (~5GB VRAM, down from ~17GB)
KIMODO_QUANTIZE=8bit - INT8 8-bit (~9GB VRAM)

This makes Kimodo usable on consumer GPUs (8-12GB) while retaining full text-prompt support. The quantized model is pinned to its device to avoid errors from .to() calls on quantized weights.

Requires: pip install bitsandbytes accelerate
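The description mentions pinning the quantized model to its device so that later .to() calls don't raise. A minimal sketch of such a guard, assuming a simple wrapper class (the class and attribute names are illustrative, not taken from the actual diff):

```python
class PinnedEncoder:
    """Wraps a quantized text encoder and ignores later .to() calls,
    since bitsandbytes-quantized weights cannot be moved across devices.
    Illustrative sketch only; not the names used in the PR."""

    def __init__(self, model, device):
        self.model = model
        self.device = device  # the device the quantized weights live on

    def to(self, *args, **kwargs):
        # No-op: the quantized weights stay pinned. Returning self keeps
        # chained calls like encoder.to(device).eval() working.
        return self

    def __getattr__(self, name):
        # Delegate everything else (encode, eval, ...) to the wrapped model.
        return getattr(self.model, name)
```

Callers can then treat the wrapped encoder like any other module; device moves simply become no-ops instead of errors.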
Thank you for this excellent project.
Summary
Adds a KIMODO_QUANTIZE env var to load the Llama-3-8B text encoder with reduced precision via bitsandbytes:

4bit (NF4, ~5GB VRAM)
8bit (INT8, ~9GB VRAM)

The quantized model is pinned to its device to avoid errors from .to() calls on quantized weights.

Motivation
Kimodo currently requires ~17GB VRAM, which limits it to high-end GPUs (A100, RTX 3090/4090). Many consumer GPUs have 8-12GB VRAM, which is enough for the diffusion model (~1GB) but not for the full-precision text encoder (~16GB).
This change lets users trade a small amount of text embedding quality for significantly lower VRAM usage, making Kimodo accessible on a much wider range of hardware.
Usage
Requires:
pip install bitsandbytes accelerate
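The PR text doesn't include the loading code itself, so here is a minimal sketch of how the env-var-to-quantization mapping could look. The function name is illustrative, and the keyword arguments are assumptions based on the standard bitsandbytes integration in transformers (load_in_4bit / load_in_8bit / bnb_4bit_quant_type), not the actual diff:

```python
import os

def quantization_kwargs(env_value):
    """Map the KIMODO_QUANTIZE env var to from_pretrained-style keyword
    arguments for the text encoder (sketch; the real diff may differ)."""
    if env_value == "4bit":
        # NF4 4-bit via bitsandbytes: ~5GB VRAM for Llama-3-8B
        return {"load_in_4bit": True, "bnb_4bit_quant_type": "nf4"}
    if env_value == "8bit":
        # INT8 via bitsandbytes: ~9GB VRAM
        return {"load_in_8bit": True}
    # Unset or unrecognized: full precision (~17GB VRAM)
    return {}

kwargs = quantization_kwargs(os.environ.get("KIMODO_QUANTIZE"))
```

A user would then run their usual Kimodo entry point with, e.g., KIMODO_QUANTIZE=4bit set in the environment.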